Česílko – an MT system for closely related languages
Abstract
The demonstration of our system addresses one very important part of the translation business: the localization of texts and programs from one source language into a group of mutually related target languages. It shows step by step a simple method of machine translation between related languages and its incorporation into an existing commercial translation aid that uses the concept of translation memory. Localizing the same source into several typologically similar target languages, one language pair after another, is clearly a waste of money and effort, because very similar problems have to be solved for each source-target language pair. Using one language from the target group as a pivot and performing the translation and localization through it seems to be a quite natural solution to these problems. It is of course much easier to translate texts from Czech to Polish or from Russian to Bulgarian than from English or German to any of these languages.

Introduction

As part of our “pivot” language solution, we use a combination of an MT system with a commercial machine-aided translation (MAT) system. We use the TRADOS system, although any such system will do. The system uses the concept of translation memory, which contains pairs of previously translated sentences in a source and a target language. When a human translator starts translating a new sentence, the system tries to match the source (with a degree of similarity set by the user) against sentences already stored in the translation memory. If a match is found, the human translator decides whether to use it, modify it or reject it.

1 Translation Memory Integration

The segmentation of the translation memory (the texts are stored as pairs of corresponding source and target language sentences) is the key feature of our method. The translation memory may be exported into a text file, which allows for easy manipulation of its content.
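The translation-memory lookup described in the Introduction can be illustrated with a minimal sketch. It is not how TRADOS computes similarity (the text does not say); the function name `best_match`, the 0.75 threshold, the use of Python's `difflib.SequenceMatcher` as the similarity measure, and the example sentence pairs are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def best_match(source, memory, threshold=0.75):
    """Return (stored_source, stored_target, score) for the most similar
    stored segment whose score reaches the threshold, or None.

    `memory` is a list of (source_sentence, target_sentence) pairs.
    The similarity measure is a stand-in, not the one used by TRADOS.
    """
    best = None
    for stored_source, stored_target in memory:
        score = SequenceMatcher(
            None, source.lower(), stored_source.lower()
        ).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (stored_source, stored_target, score)
    return best


# Invented English/Czech pairs, for illustration only.
memory = [
    ("Click the Save button.", "Klikněte na tlačítko Uložit."),
    ("Close the window.", "Zavřete okno."),
]

hit = best_match("Click the Save button now.", memory)
if hit is not None:
    print(f"suggestion: {hit[1]} (similarity {hit[2]:.2f})")
```

The translator would then accept, edit or discard the suggestion; the threshold plays the role of the user-set degree of similarity.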
Let us suppose that we have at our disposal two translation memories: one human-made for the source/pivot language pair and the other created by an MT system for the pivot/target language pair. Substituting the segments of the target language for the segments of the pivot language is then only a routine procedure. The human translator translating from the source to the target language then gets a translation memory for the required pair (source/target); no trace of the pivot language is left. The system of penalties applied in TRADOS Translator’s Workbench guarantees that a previous human-made translation present in the memory gets higher priority than the automatic translation.

This method has at least three advantages:

– Using the machine-made translation memory only as a resource supporting direct human translation from the source to the target language has no negative effect on the quality of translation.
– From the user’s point of view, there is no difference (except for the small difference in the quality of the translation memories) between our method and the original process of working with solely human-made translation memories.
– Given sufficient quality of the MT from the pivot to the target language, our method may substantially increase the speed and reduce the costs of translation from the source to the target languages.

2 The System ČESÍLKO

The system ...
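The substitution step from Section 1 can be sketched as a join of the two memories on the pivot segment. This is only an illustration under stated assumptions: the function names `compose_memories` and `rank`, the penalty value, the `"human"`/`"mt"` origin labels and the English/Czech/Slovak example pairs are all hypothetical, and the penalty scheme is a simplified stand-in for the one in TRADOS Translator's Workbench, which the text does not specify.

```python
MT_PENALTY = 0.15  # hypothetical value, not taken from TRADOS


def compose_memories(source_pivot, pivot_target):
    """Build a source/target memory by replacing each pivot segment
    with its target-language counterpart. Pivot segments that have no
    counterpart in `pivot_target` are skipped."""
    pivot_to_target = dict(pivot_target)
    composed = []
    for source, pivot in source_pivot:
        target = pivot_to_target.get(pivot)
        if target is not None:
            composed.append((source, target))
    return composed


def rank(candidates):
    """Order (target, score, origin) candidates so that machine-made
    entries are penalized relative to human-made ones."""
    def penalized(candidate):
        _, score, origin = candidate
        return score - (MT_PENALTY if origin == "mt" else 0.0)
    return sorted(candidates, key=penalized, reverse=True)


# Invented example: English source, Czech pivot, Slovak target.
human_en_cs = [
    ("Click the Save button.", "Klikněte na tlačítko Uložit."),
    ("Close the window.", "Zavřete okno."),
]
mt_cs_sk = [
    ("Klikněte na tlačítko Uložit.", "Kliknite na tlačidlo Uložiť."),
    ("Zavřete okno.", "Zatvorte okno."),
]

en_sk = compose_memories(human_en_cs, mt_cs_sk)
for source, target in en_sk:
    print(source, "->", target)
```

The composed memory contains no pivot-language text, and `rank` shows one way a human-made entry with a slightly lower match score could still be offered ahead of a machine-made one.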
Similar resources
Tagging as a Key to Successful MT
This paper describes the key role of a stochastic morphological tagger in an MT system between very closely related languages. The MT system Česílko exploits the close relatedness of both natural languages in question (Czech and Slovak), which allows substantial simplification of the translation method used. It also uses to a great advantage the possibilities of combination of a human translati...
A Method of Hybrid MT for Related Languages (Control and Cybernetics)
The paper introduces a hybrid approach to a very specific field in machine translation — the translation of closely related languages. It mentions previous experiments performed for closely related Scandinavian, Slavic, Turkic and Romanic languages and describes a novel method, a combination of a simple shallow parser of the source language (Czech) combined with a stochastic ranker of (parts of...
Translating from under-resourced languages: comparing direct transfer against pivot translation
In this paper we compare two methods for translating into English from languages for which few MT resources have been developed (e.g. Ukrainian). The first method involves direct transfer using an MT system that is available for this language pair. The second method involves translation via a cognate language, which has more translation resources and one or more advanced translation systems (e....
A Comparison of MT Methods for Closely Related Languages: a Case Study on Czech - Slovak Language Pair
This paper describes an experiment comparing results of machine translation between two closely related languages, Czech and Slovak. The comparison is performed by means of two MT systems, one representing rule-based approach, the other one representing statistical approach to the task. Both sets of results are manually evaluated by native speakers of the target language. The results are discus...
Structural Similarities in MT: A Bulgarian-Polish case
This paper shows that although it seems relatively easy to translate between closely related languages, not every framework manages to capture important details in the argument structure. By combining methods tested for translation between Swedish and Norwegian and assuming a compact theory of argument structure, I think that we can achieve better results in an MT system that deals with Slavic ...
Testing the Limits - Adding a New Language to an MT System
This paper deals with a problem of an application of an MT method developed for a pair of very closely related languages to a pair of languages whose degree of relatedness (and thus also the degree of similarity) is lower. The close relatedness of the original language pair (Czech and Slovak) allowed a substantial simplification of the translation method used. This paper provides an overview of...